21 research outputs found

    A Clustering-Based Algorithm for Data Reduction

    Finding an efficient data reduction method for large-scale problems is an imperative task. In this paper, we propose a similarity-based self-constructing fuzzy clustering algorithm to sample instances for the classification task. Instances that are similar to each other are grouped into the same cluster. When all the instances have been fed in, a number of clusters are formed automatically. The statistical mean of each cluster is then regarded as representing all the instances covered by that cluster. This approach has two advantages. One is that it is faster and uses less storage memory. The other is that the number of new representative instances need not be specified in advance by the user. Experiments on real-world datasets show that our method runs faster and obtains a better reduction rate than other methods.
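The incremental grouping described above can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the function name, the Gaussian similarity measure, and the threshold/width parameters (`rho`, `sigma`) are all assumptions for the sketch.

```python
import numpy as np

def self_constructing_reduce(X, rho=0.5, sigma=1.0):
    """Group instances incrementally by similarity and return the
    per-cluster means as the reduced dataset.
    rho: similarity threshold for joining a cluster (assumed).
    sigma: width of the Gaussian similarity measure (assumed)."""
    means, counts = [], []
    for x in X:
        # Gaussian similarity of x to each existing cluster mean
        sims = [np.exp(-np.sum((x - m) ** 2) / (2 * sigma ** 2)) for m in means]
        j = int(np.argmax(sims)) if sims else -1
        if j >= 0 and sims[j] >= rho:
            counts[j] += 1                              # running-mean update
            means[j] = means[j] + (x - means[j]) / counts[j]
        else:
            means.append(np.asarray(x, dtype=float))    # start a new cluster
            counts.append(1)
    return np.array(means)
```

Because each instance is seen once and only cluster means are stored, both the speed and memory advantages mentioned in the abstract follow directly, and the number of representatives emerges from the data rather than being fixed in advance.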

    Support Vector Selection for Regression Machines

    In this paper, we propose a method to select support vectors to improve the performance of support vector regression machines. First, the orthogonal least-squares method is adopted to evaluate the support vectors based on their error reduction ratios. By selecting the most representative support vectors, we obtain a simpler model which helps avoid the over-fitting problem. Second, the simplified model is further refined by applying the gradient descent method to tune the parameters of the kernel functions. Learning rules for minimizing the regularized risk functional are derived. Experimental results have shown that our approach can effectively improve the generalization capability of support vector regressors.
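The first step, ranking candidate support vectors by error reduction ratio with orthogonal least squares, can be sketched as below. This is a generic OLS forward-selection sketch under assumed conventions (columns of the kernel matrix as candidate regressors); it is not the paper's exact procedure, and the gradient-descent kernel tuning step is omitted.

```python
import numpy as np

def ols_select(K, y, n_select):
    """Greedily pick n_select columns of the kernel matrix K that give
    the largest error reduction ratio for target y, using Gram-Schmidt
    orthogonalization against the already-selected columns."""
    K = K.astype(float)
    y = y.astype(float)
    yy = y @ y
    selected, Q = [], []
    for _ in range(n_select):
        best_err, best_j, best_q = -1.0, -1, None
        for j in range(K.shape[1]):
            if j in selected:
                continue
            q = K[:, j].copy()
            for qk in Q:                        # orthogonalize vs chosen basis
                q -= (qk @ K[:, j]) / (qk @ qk) * qk
            qq = q @ q
            if qq < 1e-12:                      # numerically dependent column
                continue
            err = (q @ y) ** 2 / (qq * yy)      # error reduction ratio
            if err > best_err:
                best_err, best_j, best_q = err, j, q
        selected.append(best_j)
        Q.append(best_q)
    return selected
```

Keeping only the top-ranked columns yields the simpler model the abstract refers to, which can then be refined further.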

    A Symbolic Logic Approach of Deriving Initial Neural Network Configurations for Supervised Classification

    One of the problems encountered in neural network applications is the choice of a suitable initial neural network configuration for the given classification problem. We propose constructing initial neural network configurations by making use of decision trees and threshold logic. First, a decision tree is constructed from the given set of training patterns. Then the decision tree is translated into a neural network, and initial values for the weights and thresholds of the neural network are determined. Finally, the obtained neural network is trained by the back-propagation algorithm. Experimental results have shown that a neural network constructed in this manner learns fast and performs efficiently.
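The tree-to-network translation can be illustrated for the simplest case: each axis-aligned split (feature, threshold) becomes a hidden unit whose steep sigmoid approximates the threshold test. The function names, the `scale` parameter, and the sigmoid approximation are assumptions for this sketch, not details from the paper.

```python
import numpy as np

def tree_to_initial_weights(splits, n_features, scale=5.0):
    """Map decision-tree splits (feature_index, threshold) to initial
    hidden-layer weights/biases: unit i should fire when
    x[feature_i] > threshold_i, so a steep sigmoid mimics the test."""
    W = np.zeros((len(splits), n_features))
    b = np.zeros(len(splits))
    for i, (f, t) in enumerate(splits):
        W[i, f] = scale        # large weight -> near-step activation
        b[i] = -scale * t      # shifts the step to the threshold
    return W, b

def hidden_activations(x, W, b):
    """Sigmoid activations of the initialized hidden layer."""
    return 1.0 / (1.0 + np.exp(-(W @ x + b)))
```

Starting back-propagation from weights that already encode the tree's decisions, rather than from random values, is what makes the fast learning reported in the abstract plausible.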

    Time Series Forecasting with Missing Values

    Time series prediction has become more popular in various kinds of applications such as weather prediction, control engineering, financial analysis, and industrial monitoring. In real-world problems, we are often faced with missing values in the data due to sensor malfunctions or human errors. Traditionally, the missing values are simply omitted or replaced by means of imputation methods. However, omitting those missing values may cause temporal discontinuity, while imputation methods may alter the original time series. In this study, we propose a novel forecasting method based on the least squares support vector machine (LSSVM). We employ input patterns augmented with temporal information, defined as a local time index (LTI). Time series data, together with the local time indexes, are fed to the LSSVM for forecasting without imputation. We compare the forecasting performance of our method with that of other imputation methods. Experimental results show that the proposed method is promising and worth further investigation.
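The LSSVM core of such a method is standard: training reduces to one linear (KKT) system, and a missing observation can simply be absent from the training set because each sample carries its own time index. The sketch below shows that system with an RBF kernel; treating the local time index as an input feature is this sketch's assumption about how the LTI is used, and `gamma`/`sigma` are assumed hyperparameters.

```python
import numpy as np

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    """Least squares SVM regression: solve the KKT linear system
    [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2 * sigma ** 2))          # RBF kernel matrix
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma           # ridge term from 1/gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]                      # bias b, dual weights alpha

def lssvm_predict(Xtr, b, alpha, Xte, sigma=1.0):
    """Predict with the fitted LSSVM at new inputs Xte."""
    d2 = ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2 * sigma ** 2))
    return K @ alpha + b
```

Because only observed time indexes enter `X`, gaps in the series require neither omission of neighboring context nor imputation of the missing values themselves.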

    A Weight-Based Clustering Method

    This paper proposes a weight-based self-constructing clustering method for time series data. Self-constructing clustering processes all the data points incrementally. If a data point is not similar enough to any existing cluster, then (1) if the point currently does not belong to any cluster, it forms a new cluster of its own; (2) otherwise, the point is removed from the cluster it currently belongs to before a new cluster is formed. However, if a data point is similar enough to an existing cluster, then (1) if the point currently does not belong to any cluster, it is added to the most similar cluster; (2) otherwise, it is removed from the cluster it currently belongs to and added to the most similar cluster. During the clustering process, weights are learned and considered in the calculations of similarity between data points and clusters. Experimental results show that our proposed approach performs more effectively than other methods on real-world time series datasets.
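The remove-then-re-place logic above can be sketched as follows. This is a simplified reading, not the paper's algorithm: the weights `w` are taken as given rather than learned, the weighted Gaussian similarity and the `rho`/`sigma`/`passes` parameters are assumptions, and multiple passes stand in for the incremental reassignment the abstract describes.

```python
import numpy as np

def weighted_self_constructing(X, w, rho=0.5, sigma=1.0, passes=2):
    """Self-constructing clustering with fixed feature weights w.
    Each pass: a point already in a cluster is first removed from it,
    then re-placed into the most similar cluster (if similarity >= rho)
    or into a new cluster of its own."""
    assign = [-1] * len(X)
    means, counts = [], []

    def sim(x, m):
        # weighted Gaussian similarity between point and cluster mean
        return np.exp(-np.sum(w * (x - m) ** 2) / (2 * sigma ** 2))

    for _ in range(passes):
        for i, x in enumerate(X):
            c = assign[i]
            if c >= 0:                       # remove from current cluster
                counts[c] -= 1
                if counts[c] > 0:
                    means[c] = (means[c] * (counts[c] + 1) - x) / counts[c]
            sims = [sim(x, m) if counts[j] > 0 else -1.0
                    for j, m in enumerate(means)]
            j = int(np.argmax(sims)) if sims else -1
            if j >= 0 and sims[j] >= rho:    # join the most similar cluster
                counts[j] += 1
                means[j] = means[j] + (x - means[j]) / counts[j]
                assign[i] = j
            else:                            # form a new cluster
                means.append(np.asarray(x, dtype=float))
                counts.append(1)
                assign[i] = len(means) - 1
    return assign, means, counts
```

Removing a point before re-placing it lets early, poorly-placed points migrate to better clusters once the cluster means have stabilized.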